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ABSTRACT 

We allow for nonlinear effects in the likelihood analysis of galaxy peculiar velocities, and 
obtain ~ 35%-lower values for the cosmological density parameter fi m and for the amplitude of 
mass-density fluctuations <7gf^ 6 . This result is obtained under the assumption that the power 
spectrum in the linear regime is of the flat ACDM model (h = 0.65, n = 1, COBE normalized) 
with only Q m as a free parameter. Since the likelihood is driven by the nonlinear regime, we 
"break" the power spectrum at k\, ~ 0.2 (/i~ 1 Mpc) _1 and fit a power-law at k > k^. This allows 
for independent matching of the nonlinear behavior and an unbiased fit in the linear regime. 
The analysis assumes Gaussian fluctuations and errors, and a linear relation between velocity 
and density. Tests using mock catalogs that properly simulate nonlinear effects demonstrate 
that this procedure results in a reduced bias and a better fit. We find for the Mark III and SFI 
data Q m = 0.32 ±0.06 and 0.37 ±0.09 respectively, with cr 8 s2^ 6 = 0.49 ±0.06 and 0.63 ±0.08, in 
agreement with constraints from other data. The quoted 90% errors include distance errors and 
cosmic variance, for fixed values of the other parameters. The improvement in the likelihood 
due to the nonlinear correction is very significant for Mark III and moderately significant for 
SFI. 

When allowing deviations from ACDM, we find an indication for a wiggle in the power 
spectrum: an excess near k ~ 0.05 (/i _1 Mpc) _1 and a deficiency at k ~ 0.1 (/i _1 Mpc) _1 — a 
"cold flow" . This may be related to the wiggle seen in the power spectrum from redshift surveys 
and the second peak in the CMB anisotropy. 

A x 2 test applied to modes of a Principal Component Analysis (PCA) shows that the nonlinear 
procedure improves the goodness of fit and reduces a spatial gradient that was of concern in 
the purely linear analysis. The PCA allows us to address spatial features of the data and to 
evaluate and fine-tune the theoretical and error models. It demonstrates in particular that the 
models used are appropriate for the cosmological parameter estimation performed. We address 
the potential for optimal data compression using PCA. 

Subject headings: Cosmology: observations — cosmology: theory — dark matter — galaxies: 
clustering — galaxies: distances and redshifts — large-scale structure of 
universe 



1. INTRODUCTION 

Our standard cosmological framework assumes that 
structure originated from small-amplitude density 
fluctuations that were amplified by gravitational in- 
stability. These initial fluctuations are assumed to 
have a Gaussian probability distribution, fully char- 
acterized by their power spectrum P{k). On large 
scales, the fluctuations are expected to be linear even 
at late times, still characterized by the initial P(k), 
which is directly related to the cosmological param- 
eters. This makes the P(k) a useful statistic for the 
study of both, the origin of large-scale structure and 
the global cosmological parameters. 



The P(k) as estimated from galaxy redshift sur- 
veys (see reviews by Strauss & Willick 1995; Strauss 
1999) is contaminated by unknown "galaxy biasing", 
reflecting the possibility that the spatial distribution 
of galaxies is not an accurate tracer of the underly- 
ing mass distribution (e.g., recent references Blanton 
et al. 1999; Dekel &: Lahav 1999; Somerville et al. 
2001; Tegmark & Bromley 1999). Additional com- 
plications arise from redshift distortions, triple- value 
zones and the nonlinearity of the density field, which 
complicates the recovery of P(k). To avoid galaxy 
biasing, it is advantageous to estimate the mass P (k) 
directly from purely dynamical data such as the pe- 
culiar velocities. Another advantage of velocity over 
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density data is that they probe the density field on 
scales larger than the sample itself, and therefore are 
subject to weaker nonlinear effects. 

Direct estimation of the P(k) from the recon- 
structed velocity fields by POTENT-like procedures 
(Dekel et al. 1999) is complicated by the need 
to correct for the effects of large noise, smoothing, 
and sparse and nonuniform sampling (e.g., Kolatt & 
Dekel 1997; see also Park 2000). On the other hand, 
the likelihood analysis applied here, improving on the 
simplified linear version of Zaroubi et al. (1997, Z97) 
and Freudling et al. (1999, F99), acts on the 'raw' 
data without pre-processing. It utilizes much of the 
information content of the data, while taking into 
account the measurement errors and the finite, dis- 
crete sampling. The simplifying assumptions made in 
turn are that the peculiar velocities are drawn from 
a Gaussian random field, that the velocity correla- 
tions can be derived from the density P[k) using lin- 
ear theory, and that the errors are also Gaussian. 
The method requires to assume as a prior model a 
parametric functional form for the P(k), which then 
allows for cosmological parameter estimation. 

Since we address here the mass-density power spec- 
trum as derived from peculiar-velocity data, we de- 
termine directly the quantity P(k) f)^ 2 (where Q m 
is the cosmological mass-density parameter). This 
leads to a measure of a purely dynamical parame- 
ter such as osf^ 6 (where as is the rms mass-density 
fluctuation in a top-hat sphere of radius 8h _1 Mpc). 
When assuming a parametric functional form for the 
mass P(k), e.g., based on a cosmological CDM model, 
we could in principle remove the degeneracy between 
fl m and as, and determine a combination of dynami- 
cal parameters such as Q m , the baryonic contribution 
fib, the Hubble constant h, and the power index on 
large scales n [where P{k) oc k n ]. These parameters 
enter via the shape and amplitude of P(k) as well as 
the geometry and dynamics of space-time. 

Note for comparison that investigations involving 
galaxy redshift surveys commonly measure a dif- 
ferent parameter that does involve galaxy biasing, 
(3 = $7°' 6 /6 (where b is the linear biasing parameter). 
The parameters crgfi^ 6 and (3 (at 8h _1 Mpc) are re- 
lated via osg, referring to the rms fluctuation in the 
galaxy number density. Numerous measurements of 
(5 have been carried out so far, either based on red- 
shift distortions, e.g., in IRAS catalogs (Fisher et al. 
1994; Tadros 1999; Hamilton, Tegmark Sz Padman- 
abhan 2000) or based on comparisons of such redshift 
surveys and the peculiar-velocity data. Most recent 
velocity-velocity comparisons found values for (5 in 
the range 0.4 - 0.7 (Davis, Nusser & Willick 1996; 
Willick et al. 1997b; da Costa et al. 1998; Kash- 
linsky 1998; Willick & Strauss 1998; Branchini et al. 



2000) , while density-density comparisons have lead to 
values as high as 0.9 (e.g. Sigad et al. 1998). A deter- 
mination of the biasing- free quantity asfl^n directly 
from the peculiar velocity data may help clarifying 
the confusion about (3. Moreover, the direct measure 
of ri m will enable a biasing- free result. 

We use two catalogs of galaxy peculiar velocities. 
The Mark III (M3) catalog (Willick et al. 1995, 1996, 
1997a) contains ~3000 galaxies within ~70h~ 1 Mpc. 
It has been compiled from several different data sets 
of spiral and elliptical/SO galaxies with distances in- 
ferred by the forward Tully-Fisher (TF) and D n — a 
methods. The sampling is dense nearby and much 
sparser at large distances. The error per galaxy is on 
the order of 15 — 21% of the distance. The galax- 
ies were first grouped into ~ 1200 objects, ranging 
from isolated galaxies to rich clusters, in order to re- 
duce the non-linear noise and the resulting Malmquist 
bias. The data were then systematically corrected 
for Malmquist bias. The SFI catalog (Haynes et 
al. 1999a, 1999b) consists of ~ 1300 late-type spiral 
galaxies with I-band TF distances from two datasets. 
It covers a volume similar to M3, with sparser sam- 
pling nearby but a more uniform coverage of the vol- 
ume. Following da Costa et al. (1996), about 7% 
of the galaxies, those with the smallest line-width 
(\ogw < 2.25), have been discarded because of the 
unreliability of the TF relation and its scatter at such 
line- widths. The data were corrected for Malmquist 
bias using the method described in Freudling et al. 
(1999). 

In earlier papers (Z97; F99; Zehavi & Dekel 1999), 
we have applied to these data a purely linear like- 
lihood analysis. The model assumed was a linear 
P(k) on all scales, taken at large from the family 
of CDM models, normalized by COBE's measure- 
ments of the large-scale fluctuations in the CMB. The 
free parameters were typically fl m , n, and h§§ (the 
Hubble constant in units of 65 kms _1 Mpc _1 ). The 
constraints obtained from the two data sets turned 
out to be similar, both yielding a relatively high 
P(k), in general agreement with the direct estimate 
from the "POTENT" reconstruction (Kolatt k Dekel 
1997). The constraints defined an elongated two- 
dimensional surface in the Vl m -h-n parameter space, 
which could be crudely approximated in the case of 
a flat universe (and no tensor fluctuations), for M3 
and SFI respectively, by Vl m h\£ n 2 ~ 0.56 ± 0.09 
and 0.51 ± 0.10 (90% errors). Corresponding con- 
straints were a 8 fi^ 6 ~ 0.85 ± 0.11 and 0.82 ± 0.12 re- 
spectively. These results seemed to be conservatively 
consistent with the la lower bounds of Q m > 0.3 ob- 
tained from peculiar velocities by other biasing-free 
methods (Nusser & Dekel 1993; Dekel & Rees 1994; 
Bernardeau et al. 1995), but they imply higher val- 
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ues for Q m and as than obtained from other esti- 
mators, e.g. based on cluster abundance (og^^ 56 ^ 
0.57 ± 0.05, White, Efstathiou & Frenk 1993) or its 
evolution (n m ~ 0.45 ±0.25 and a 8 ~ 0.7±0.15, Eke 
et al. 1998). 

The likelihood method has been tested by Z97 and 
F99 using mock catalogs drawn from an N-body sim- 
ulation of a constrained realization of our real cos- 
mological neighborhood (Kolatt et al. 1996). These 
tests indicated that nonlinear effects may cause only 
small differences in the results. However, this sim- 
ulation was limited in an important way; it had a 
fixed dynamical resolution of only ~ 2h~ 1 Mpc, and 
therefore suffered from certain smearing of nonlinear 
effects on the scales of individual galaxies and close 
pairs. Being a simulation of an Q m = 1 cosmology 
also contributed to the underestimation of the density 
fluctuations associated with the observed peculiar ve- 
locities, compared to the currently favored cosmology 
with a lower Vt m . Despite the fact that the nonlin- 
earities are expected to be weaker for velocities than 
for densities, and that the flows are known to be rel- 
atively "cold" on small, mildly nonlinear scales, it is 
quite possible that the resolution of the early simu- 
lation was insufficient for an accurate evaluation of 
how nonlinear effects influence the results in the real 
world. 

We have therefore generated new mock catalogs 
that are based on simulations of much higher reso- 
lution (Kauffmann et al. 1999a). We simulated both 
a high-f2 m model and a low-f2 m one, and galaxies 
were identified based on a more physical semi-analytic 
scheme (Kauffmann et al. 1999a, 1999b), which al- 
lows us to better mimic the real sampling and correct 
for associated biases. 

Equipped with these nonlinear mock catalogs, we 
re-consider the nonlinear effects in our original linear 
analysis. Once we discover an indication for a bias in 
this analysis, we introduce ways to incorporate non- 
linear effects. We realize that the fit is driven by the 
small scales, because close pairs arise from nearby 
galaxies of small errors. This means that even weak 
non-linear effects on small scales may bias the results. 
We do not have yet a good analytic approximation for 
the nonlinear corrections to the velocity P(k), so we 
simply add free parameters that allow independent 
matching of the nonlinear behavior. For example, we 
introduce a break in P(k) at k^ ~ 0.2, allow an ar- 
bitrary two-parameter power-law fit in the nonlinear 
regime k > k\> and thus free the linear part of the 
spectrum at k < k^, and the associated cosmological 
parameters, to be determined unbiased. 

This nonlinear correction procedure is first tested 
using the nonlinear mock catalogs, and then applied 
to the M3 and SFI data. We find that the obtained 



values of £l m and erg are significantly lower than in the 
purely linear analysis, and the results are not sensi- 
tive to the exact way by which the nonlinear effects 
are incorporated. 

We also investigate the power spectrum in the rel- 
evant range of scales independent of a specific cos- 
mological model, by allowing as free parameters the 
actual values of P(k) in finite intervals of k (also in 
Zehavi & Knox, in preparation). In particular, this 
allows a detection of marginally significant deviations 
from the ACDM power spectrum, which can be char- 
acterized as "cold flows". 

The likelihood analysis provides only relative like- 
lihood of the different models, not an absolute good- 
ness of fit (GOF). An indication for a problem in 
the goodness of fit in the purely linear analysis came 
from a x 2 estimate in modes of a principal compo- 
nent analysis (Hoffman & Zaroubi 2000). It seems 
to be associated with a problem noticed earlier by 
Freudling et al. (1999), of a spatial gradient in the 
obtained value of Q. m . We develop a method based 
on x 2 an d PCA as a tool for evaluating the goodness 
of fit in our procedure, and find a significant improve- 
ment in the GOF when the nonlinear corrections are 
incorporated and the most noisy data are pruned. 

In § we describe the likelihood method of analysis, 
the parametric models used as priors, and the way we 
allow for nonlinear effects. In §|5]we test and calibrate 
the method using mock catalogs. In §|| we present the 
resultant power spectrum and the constraints on the 
cosmological parameters for ACDM, and detect hints 
for deviations from this model. In §[| we address the 
goodness of fit via % 2 in modes of PCA. In §|| we 
conclude. 

2. METHOD 

2.1. Likelihood Analysis 

The general method has been developed and ap- 
plied in Zaroubi et al. (1997) and Freudling et al. 
(1999) (following Kaiser 1988; Jaffe & Kaiser 1994). 
The goal is to estimate the power spectrum of mass 
density fluctuations from peculiar velocities, by find- 
ing maximum likelihood values for parameters of an 
assumed model power spectrum. Given a data set 
d, our objective is to estimate the most likely model 
parameters m. Using Bayes theorem the conditional 
probabilities are related by 



V(m\d) 



V(m)V(d\m) 



and assuming a uniform prior 'P(m), the task be- 
comes the maximization of the the likelihood func- 
tion C = 'P(d|m) as a function of the assumed model 
parameters. 
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Under the assumption that both the underlying ve- 
locities and the observational errors are independent 
Gaussian random fields, 1 the likelihood function can 
be written as 



1 



1 



[(2vr)^det(C)] 1 /2 



ex p -oE^^ 1 ' 



(2) 



h3 



This is simply the corresponding multivariate Gaus- 
sian distribution, where {ui}™ =1 is the data set of n 
observed peculiar velocities at locations {r^}, and C is 
their correlation matrix. Expressing each data point 
as the sum of the actual signal and the observational 
error U{ = Sj + n«, the elements in the correlation 
matrix have two contributions: 



C; 



(UiUj 



(SiSj) 



+ (run*) = Sij + N i: 



(3) 



The first term is the correlation matrix of the signal, 
which is calculated from the theoretical P(k) model 
at the sample positions rj. The second term is the 
error matrix, which is diagonal based on the assump- 
tion that the distance errors of the objects in the sam- 
ple are uncor related with each other. This should be 
true for the two components of the errors, the ob- 
servational errors and the intrinsic scatter of the TF 
relation. 2 

For a given P(k), the signal terms are calculated 
using their relation to the parallel and perpendicular 
velocity correlation functions, VlZy and *$>±, 

Sij = *i (r) sin 6i sin 9j + *y (r) cos 0, cos 8j , (4) 

where r = |r| = \rj — r»| and the angles are defined 
by 6i = r, • r (Gorski 1988; Groth, Juszkiewicz & Os- 
triker 1989). In linear theory, each of these can be 
calculated from the P(k), 



*U|( r ) 



2tt 2 



P(k)K ±]] (kr)dk, (5) 



where K±(x) = ji(x)/x and K\\(x) = jo — 2j%(x)/x, 
with ji(x) the spherical Bessel function of order I. 
The cosmological f2 m dependence enters as usual in 
linear theory via /(f2 m ) — ^m 6 > an d Ho is the Hubble 
constant (H = lOO/ikms-^pc -1 ). 

For each choice of the model parameters the cor- 
relation matrix C is computed, inverted, and substi- 
tuted in the likelihood function [Eq. (0)]. Exploring 
the chosen parameter space, we find the parameters 



for which the likelihood is maximized. 3 The main 
computational effort is the calculation and inversion 
of the correlation matrix C in each evaluation of the 
likelihood. It is an n x n matrix, where the number 
of data points n is typically more than 1000. This 
number is expected to increase when future samples 
become available, which will require a procedure for 
data compression (see §[|). 

The random measurement errors deserve special at- 
tention; they add in quadrature to the true P(k) and 
thus propagate into a systematic uncertainty in the 
results. Zaroubi et al. (1997) used a priori estimates 
of the errors, which were based on evaluations of the 
observational and internal scatter of the TF distances 
using galaxies in clusters or local velocity-field mod- 
els (Willick et al. 1995). Freudling et al. (1999) im- 
proved the method by incorporating the errors into 
the likelihood analysis itself via an error model with 
free parameters, which only weakly builds upon the 
original error estimates. The maximum-likelihood er- 
rors were found to be within 5% of the a priori error 
estimates, thus allowing us to adopt the a priori er- 
ror estimates in our following analysis. 

Relative confidence levels are estimated by approxi- 
mating — 21n£ as a x 2 distribution with respect to the 
model parameters. The likelihood analysis provides 
only relative likelihoods of different models. Absolute 
goodness of fit is addressed separately in §|| below. 

2.2. The Cosmological Power Spectrum Model 

In the linear regime, we use as prior the parametric 
form for the P(k) based on the general CDM model, 



P(k) = A c (n m , fi A , n) T 2 (n m , Q b , h; k) k r ' 



(6) 



where A c is the normalization factor and T(k) is the 
CDM transfer function proposed by Sugiyama (1995, 
a slight modification of Bardeen et al. 1986): 



T(k) 



ln(l + 2.34o) 

= — -x 

2.34g 

1 + 3.89g + (16.1<?) 2 + (5.46<?) 3 + (6.71q) 4 



-1/4 



(7) 



q = k [n m h exp(-n b - V2hn b /n m )] (h^MpcY 



li-l 



(8) 



lr The assumed Gaussianity of the velocity field in the mildly nonlinear regime is supported by simulations (Kofman et al. 1994, 
Kudlicki et al. 2001), and is verified by the nearly normal distribution of the observed \n(z/d) in our sample. 

2 Freudling et al. (1999) tested the impact of uncertainties in the bias correction, which might have introduced correlations in 
the errors, by varying parameters in the bias model within the expected uncertainties. They found the changes in the results to be 
negligible compared to the other systematic random errors in the analysis. 

3 Note that since the model parameters appear also in the normalizing factor of the likelihood function, through C, maximizing 
the likelihood is not equivalent to minimizing the x 2 - 
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We restrict ourselves in the present paper to the 
flat cosmological model with a cosmological constant 
(0, m + J7a = 1)> a scale-invariant power spectrum on 
large scales (n = 1), and no tensor fluctuations. The 
Hubble constant is fixed at h = 0.65 (e.g., based on 
Freedman 1997). 

As in earlier power-spectrum analyses, the bary- 
onic density is set to be Qb = 0.024/i -2 (Tytler, Fan 
&: Buries 1996), and the amplitude A c is fixed by the 



COBE 4-year data (Hinshaw et 
al. 1998), 



al. 1996; Gorski et 



\ogA c = 7.84 - 8.33ft m + 21.31ft 
10.65*4 + 15.42f4 - 6.04S1 
8.61^1 . 



29.67fi^+ 
13.97^L + 



(9) 



In order to check the sensitivity to uncertainties 
in these quantities, we have repeated the likelihood 
analysis with a later estimate of Vt^ = 0.019/i -2 
(Buries et al. 1999) and an alternative COBE nor- 
malization (Bunn & White 1997, equations 25 and 
29) . The obtained values of J7 m are found to be very 
robust to these changes, with variations smaller than 



2.3. Broken Power Spectrum 



As will be demonstrated below (§|5j), the maximum- 
likelihood solution is driven by the small scales, 
k > 0.2 (/i _1 Mpc) _1 , because close pairs preferen- 
tially consist of nearby galaxies for which the errors 
are typically small. In the case where the same pa- 
rameters determine the P(k) on all scales, this means 
that even small inaccuracies in the power spectrum 
shape at large wave-numbers may bias the results at 
small wave- numbers, and therefore the value of O m . 
An accurate nonlinear correction to the linear velocity 
power spectrum could have been very useful in avoid- 
ing this bias, but, unfortunately, such a correction is 
not yet available. As mentioned earlier, a successful 
empirical approximation does exist for the nonlinear 
correction to the density P(k) (Peacock & Dodds 
1996, PD), modeling a gradual deviation from the 
linear P(k) at k > 0.2 (/i _1 Mpc) _1 , but the general- 
ization to a velocity correction is not straightforward 
because there is no explicit exact relation between 
velocity and density in the nonlinear regime. 

The procedure adopted here is to detach the non- 
linear regime from the linear regime by introducing 
a "break" in the power spectrum at a wavenumber 
k\y. We then assume the ACDM shape for the P(k) 
at k < kb, determined by physical free parameters 



such as fi m , and allow an almost arbitrary function 
with enough free parameters to fit the data at k > A^. 
We try, for example, a power law, with two free pa- 
rameters: P(k) = Bk~ s . This power-law serves the 
purpose of feeding the "likelihood monster" residing 
in the nonlinear regime, while freeing the linear part 
of the spectrum, and the associated cosmological pa- 
rameters, to be determined unbiased. The break scale 
could be an additional free parameter; we test below 
the robustness of the results to the actual choice of 
k h . 

This approach can be carried to an extreme where 
we break P(k) in several places, and fit arbitrary 
functional forms independently within finite intervals 
of k. Once we do that, we allow for more flexibility 
in the power-spectrum shape, and can detect specific 
deviations from the predicted CDM shape. Naturally, 
this procedure would limit our ability to address cos- 
mological parameters. The choice of a series of inde- 
pendent step functions (or "band powers", forming 
a histogram) is especially appealing computationally, 
because in this case the correlation matrix becomes 
a simple linear combination of the correlation matri- 
ces of the individual segments, and then the integrals 
entering the correlation matrix need to be computed 
only once. 5 

Alternatively, we can repeat the procedure of 
Freudling et al. (1999) (also used in Bridle et al. 
2001) where the nonlinear effects are accounted for by 
adding to the linear velocity correlation model a free 
parameter of uncorrelated velocity dispersion at zero 
lag, <t v , e.g., representing small-scale random virial 
motions. This is a different way of incorporating non- 
linear effects, possibly with a different physical inter- 
pretation. It would thus be interesting to see how 
the different methods affect the results in the linear 
regime, and whether both are needed. 

It is not clear that such prescriptions will capture 
the exact shape of the power spectrum in the nonlin- 
ear regime, and this is not our purpose here. How- 
ever, we find that they may be sufficient for more ac- 
curate estimation of the power spectrum in the linear 
regime, and for an unbiased determination of the cos- 
mological density parameter under an assumed para- 
metric shape for the linear power spectrum. 

3. TESTING THE METHOD 

3.1. Mock catalogs 

We test the method using artificial mock cata- 
logs based on a cosmological simulation, in which 
the "true" cosmological parameters and linear power 



4 First attempts in this direction are made by Sheth, Zehavi & Diaferio (2000). 

5 A way to implement band powers with a large number of free parameters is via an iterative quadratic estimator scheme, com- 
monly applied to CMB measurements {e.g., Bond, Jaffe & Knox 1998), which greatly improves the computational efficiency and 
simultaneously provides the cross correlation between the different bands (Zehavi & Knox in preparation). 
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spectrum are fully known a priori and where nonlin- 
ear effects are simulated with adequate accuracy on 
galactic scales. We use the unconstrained "GIF" sim- 
ulation (Kauffmann et al. 1999a) of the flat ACDM 
cosmology with fi m = 0.3. The initial fluctuations in 
this simulation were Gaussian, adiabatic and scale- 
invariant, n = 1. The P{k) shape parameter was 
r = Vt m h = 0.21 (namely h = 0.7) and the amplitude 
is such that <7g = 0.9 (extrapolated by linear the- 
ory from the initial conditions to z = 0), consistent 
with both the present cluster abundance and COBE's 
measurements on large scales. The iV-body code is a 
version of the adaptive particle-particle particle-mesh 
(AP 3 M) Hydra code developed as part of the VIRGO 
supercomputing project (Jenkins et al. 1998). The 
simulation has 256 3 particles and 512 3 cells, and a 
gravitational softening length of 30h _1 kpc, inside a 
box of side 141.3 h _1 Mpc. 

Dark-matter halos were identified using a friends- 
of-friends algorithm with a linking length of 0.2 (cor- 
responding to a density contrast of ~ 125 at the halo 
edges) and a minimum of 10 particles per halo was 
imposed. Luminous galaxies were planted in the ha- 
los based on a semi-analytic scheme (Kauffmann et 
al. 1999a, 1999b) whose main elements can be sum- 
marized as follows. A merger history is constructed 
for each halo. The gas in every progenitor halo is 
assumed to cool radiatively and settle into a galactic 
disk. Stars are assumed to form in a rate proportional 
to the mass of cold gas and inversely proportional 
to the dynamical time. Cold gas may be re-heated 
by supernovae feedback and removed from the disk 
or from the halo all together. When halos merge, 
the central galaxy of the largest halo becomes the 
central galaxy of the new halo and all other galax- 
ies become satellites which later merge with the cen- 
tral galaxy on a dynamical-friction time scale. Major 
mergers result in destruction of disks and formation of 
spheroids, thus determining the morphological type. 
The star formation history of each galaxy is convolved 
with stellar population synthesis models and extinc- 
tion models to obtain total luminosities in different 
bands. 

We assigned to each galaxy a linewidth based on 
the TF relation and scatter assumed in Kolatt et 
al. (1996), and a diameter based on the magnitude- 
diameter relation used in Freudling et al. (1995). 
We then generated 10 mock catalogs which resemble 
the M3 catalog and 10 which resemble the SFI sam- 
ple. The selection procedure was simulated using the 
galaxy magnitudes (M3) or angular diameters (SFI) 
generated above, in a way that reproduces the red- 
shift and luminosity distributions in the real catalogs. 
The samples were further randomly diluted, simulat- 
ing selection by other independent properties such as 



inclination, to match the number of galaxies in the 
real catalogs. This random sampling, along with the 
random distance errors (introduced by the TF scat- 
ter), has been repeated 10 times to generate the 10 
mock catalogs. The mock data were corrected for 
Malmquist biases exactly the way they were corrected 
in the real data, including the first step of grouping 
for the M3 samples. 

The degree of nonlinearity , which is a key feature in 
our current analysis, depends on the exclusion of clus- 
tered galaxies from the sample. In M3, rich clusters 
of either elliptical or spiral galaxies were selected a 
priori and considered as single massive objects mov- 
ing with their center of mass velocities. The typi- 
cal radii of rich clusters are crudely ~ 3h _1 Mpc and 
~ 6h~ 1 Mpc for ellipticals and spirals respectively. 
Since the M3 catalog is densely sampled at small dis- 
tances, and it includes elliptical galaxies which tend 
to cluster, further groups were identified from the 
"field" samples and were also treated as single ob- 
jects. In the SFI catalog, which consists only of spi- 
rals, the cluster galaxies were excluded a priori and 
included in an associated catalog, termed SCI; there 
was no need for further grouping of the relatively 
sparse field sample. Unfortunately, the exclusion and 
grouping of clustered galaxies in the two real catalogs 
did not follow a simple uniform and objective algo- 
rithm that is straightforward to mimic in the simula- 
tions. 

We therefore produced a suite of 10 sets of 10 mock 
catalogs each, spanning a range of degree of nonlin- 
earity, created by varying the criterion for the ex- 
clusion of cluster galaxies. Galaxies were excluded if 
they lie within a distance r c from the cluster center. 
The "linearity parameter" r c is measured in units of 
3.5h _1 Mpc and 1.5h _1 Mpc for spirals and ellipticals 
respectively (the radii used in the old mock catalogs 
of Kolatt et al. 1996), and it ranges from r c = 0.1 
to 1 in steps of 0.1. This allows us to study how our 
method performs in the presence of different degrees 
of nonlinear effects. 

3.2. Bias in Q, m and its Correction 

We have applied the likelihood analysis to each of 
the 10 x 10 mock M3 catalogs. The recovered val- 
ues of Q m are shown in Figure [l] as a function of 
the degree of linearity of the dataset, as measured 
by r c . The "true" target value is J7 m = 0.3. Each 
symbol represents the average over the 10 mock cat- 
alogs for each value of r c . The small errorbars mark 
the standard deviation over these 10 mock catalogs, 
and thus represent the uncertainty due to the ran- 
dom sampling and the random distance errors. The 
large errorbars are 84% errors based on the likelihood 
analysis, namely determined by AlnC = 1; they thus 
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Fig. 1. — Testing the method at different levels of nonlinearity. Bottom: the value of the recovered density parameter 
fl m as a function of the degree of linearity of the dataset, as measured by r c . The "true" value is Cl m = 0.3. Each symbol 
marks the average over 10 mock M3 catalogs, and the inner error-bar marks the corresponding standard deviation. The 
outer error-bar is the 84% likelihood uncertainty (corresponding to Aln£ = 1), which includes cosmic variance. The 
squares represent the results of the likelihood analysis using the purely linear ACDM P(k); they show a significant bias 
that increases with decreasing r c . The triangles represent the results of the improved analysis using a broken- ACDM 
P(k); the bias is drastically reduced. Top: the corresponding mean improvement in log£ for the nonlinear analysis versus 
the linear analysis. 



include the effects of random distance errors and cos- 
mic variance. 

We first apply the purely linear analysis, with the 
linear ACDM power spectrum at all scales, COBE 
normalized, and with 0, m as the only free parame- 
ter while all the other parameters are fixed at their 
"true" values. We see that the linear likelihood anal- 
ysis systematically overestimates the value of Q m . As 
the data become more nonlinear, the recovered value 
of Q m becomes higher, and the bias more significant. 
For example, at r c = 0.2, we obtain Q m = 0.53, which 
is more than a 4<r deviation from the true value. 

Next, we apply the improved procedure, allowing 
for a break in the P(k) at k h = 0.2 (Tr^Mpc) -1 
and two additional free parameters in the nonlinear 
regime. As demonstrated in Figure [l], the bias is 
practically removed for all levels of nonlinearity. The 
figure also shows the corresponding improvement in 
log£ when the linear analysis is replaced with the 
nonlinear analysis. The improvement grows continu- 
ously with decreasing r c , from Aln£ ~ 1 at r c = 1 to 



~ 120 at r c = 0.1. 

Figure |2| shows the average mass-density power 
spectra recovered from the M3 mock catalogs of lin- 
earity r c = 0.2. The true linear density P(k) of the 
simulation, moved forward in time by linear theory, 
is shown for comparison. Also shown is the Peacock 
& Dodds (1996) nonlinear correction at large k. The 
P{k) recovered by the linear likelihood analysis (L), 
corresponding to f2 m = 0.53, is higher than the true 
P(k) at k = 0.2 (/T^Mpc)" 1 by a factor of ~ 5. The 
nonlinear analysis (NL), with O m = 0.34 compared 
to the true Q m = 0.30, brings the P(k) down much 
closer to the true P(k). 

The power-law segment at k > k\, crudely recovers 
the amplitude of the PD power spectrum at k ~ 1, 
but apparently not the general slope. A good agree- 
ment between the two is not obvious a priori because 
the PD correction refers to the density P(k), while 
our likelihood analysis is based on the velocity power 
spectrum and is still making use of the linear relation 
between velocity and density. We shall see below that 
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Fig. 2. — Mean power spectra recovered from the M3 mock catalogs of r c = 0.2. The target is the true linear P(k) 
(marked "True"). The nonlinear correction by Peacock & Dodds (PD) is shown for comparison. The result from the 
linear analysis is marked "L". The result from the nonlinear analysis with fcb = 0.2 (/i _1 Mpc) , marked "NL", consists 
of a ACDM function at fc < fcb and a power-law at fc > fcb- The P(k) is in units of (/i _1 Mpc) 3 . 



when applied to the real data, of either M3 or SFI, 
the NL segment does match the PD approximation 
somewhat better. 

The true initial power spectrum in the simulation 
was actually based on the T functional form (e.g., 
Efstathiou, Bond & White 1992), which is a slightly 
different approximation to the ACDM spectrum than 
the one used as a prior in our likelihood analysis, 
Eq. (0). The differences between these two power 
spectra for the same values of £l m h are relatively 
small, e.g., at the level of 20% at k = 0.1. To test 
the robustness of our results to these small differ- 
ences in the power-spectrum shape, we also applied 
the same likelihood analysis to the mock catalogs us- 
ing the r model as prior. The free parameter in this 
case was T, which is equivalent to Q. m for a fixed 
h. The normalization of f(Q m )P(k) was fixed at a 
small wavenumber, k = 0.001( h _1 Mpc)~ 1 , to equal 
the true normalization in the simulation. The results 
are found to be robust. For example, at a linear- 
ity of r c = 0.2, the linear T model yields a best fit 
of O m = 0.54 (instead of 0.53 when Eq. (0) is used, 
with COBE normalization), and the broken T model 
yields Q m = 0.36 (instead of 0.34). 



When nonlinear effects are in action, one might 
worry about mode-mode coupling affecting the re- 
sults in the linear regime. The unbiased estimates of 
the linear P(k) and Q, m obtained with our method 
when applied to the mock catalogs indicates that 
the modes for k < fcb ar e practically decoupled from 
modes with fc > fcb. We can assume that the linear 
modes with k < fcb are not coupled among them- 
selves, but we do not know much about possible cou- 
pling among the nonlinear modes with fc > fcb- This 
is fine as long as we do not assign physical significance 
to the shape of the power spectrum recovered on these 
scales, beyond assuming that the overall probability 
distribution is well approximated by a model with a 
certain function playing the role of the power spec- 
trum. 

Our conclusion from the above test using the mock 
catalogs is that, in the presence of significant non- 
linear effects in the data, the purely linear likelihood 
analysis might yield a biased estimate of P(k) and 
Q m . The broken-P(fc) analysis successfully eliminates 
the dependence of the results on the nonlinear effects 
and practically corrects the bias in the results. 
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4. RESULTS 
4.1. Broken ACDM: the Value ofQ m 

We now apply the improved likelihood analysis to 
the real data of M3 and SFI. Similar to the tests 
with the mock data, our ACDM model is restricted 
to a flat universe with h = 0.65, f^/i 2 = 0.02 and 
n = 1, leaving only one cosmological parameter free 
to be determined by the maximum likelihood analy- 
sis, namely Q m . Note that contrary to the situation 
in the mock catalogs we now do not know a priori 
that the ACDM model is the right one or that the 
values of the fixed parameters are the accurate ones. 
The purely linear analysis yields Q m = 0.56 ± 0.04 
and O m = 0.51 ± 0.05 for M3 and SFI respectively 
(90% errors), consistent with Z97 and F99 when h 
and n are fixed at the values quoted above. Based 
on the test using mock catalogs, we now suspect that 
these might be overestimates. 

Figure |3| shows the maximum-likelihood power 
spectra as derived from the two catalogs of real data. 
The linear analysis yields a high amplitude, corre- 
sponding to a high value of fc pea k where P(k) is at 
maximum, and the corresponding high value of D, m . 

The nonlinear analysis with a break at k^ = 
0.2 (/i _1 Mpc) _1 yields a shift of /c pea k towards lower k 
values, associated with a lower value of Q m , and a cor- 
responding lower amplitude for P(k) in much of the 
linear regime. The new values are VL m = 0.32 ± 0.06 
for M3 and VL m = 0.37 ± 0.09 for SFI. 

The corresponding best-fit values of cs^m 6 are 
0.49 ± 0.06 and 0.63 ± 0.08 for M3 and SFI respec- 
tively. These values are consistent with the estimates 
from cluster abundance {e.g., Eke et al. 1998). 

The change caused in the value of J7 m due to the 
nonlinear correction is similar to the corresponding 
change in the mock catalogs of M3 at a relatively 
high degree of nonlinear ity, r c ~ 0.2 in Figure [l] 

The best-fit power-law segments in the nonlinear 
regime are 145/c -0 ' 8 for M3 and 155/c -1,4 for SFI 
(where k is in units of (/i _1 Mpc) _1 , and P(k) in units 
of (/i _1 Mpc) 3 . The power-law segments roughly co- 
incide with the linear ACDM segments at k^, indi- 
cating that this broken power spectrum is a sensible 
approximation to the actual shape of the P(k). Inter- 
estingly, the best-fit power laws match quite closely 
the PD nonlinear corrections for the density P(k). 
This match should not be taken as a very meaningful 
estimate of the actual shape of the power spectrum 
in the nonlinear regime. Our nonlinear segment re- 
sults from applying a procedure that is still based on 
the linear gravitational instability relation between 
velocity and density to a phenomenon that is fully 
nonlinear. We have no immediate reason to expect 
the PD formalism to extend to peculiar velocities so 



straightforwardly. We therefore take this match only 
as a crude guideline to encourage further investiga- 
tions into the behaviour of the velocity power spec- 
trum (and the likelihood function) in the mildly non- 
linear regime. 

As expected, the nonlinear correction for the M3 
catalog is larger than for the SFI data, because the 
former has more galaxies nearby and therefore a 
larger number of close pairs with small errors. The 
M3 P(k), which was somewhat higher than SFI in the 
linear analysis, becomes somewhat lower than SFI as 
a result of the nonlinear analysis, but the two cata- 
logs basically yield consistent results. The likelihood 
improvement for M3 is very significant, Aln£ ~ 22, 
while for SFI it is moderately significant, Aln£ ~ 2.8. 

Figure shows the likelihood CIS £1 function of Q m 
for each of the two datasets, where for each value 
of Q m , the power-law parameters in the nonlinear 
regime obtain their most likely values. The maxi- 
mum is narrower for M3 than for SFI. When, instead, 
we marginalize over the two power-law parameters in 
the nonlinear regime, the obtained best Q m remains 
the same (to within 0.01) and the likelihood function 
becomes wider by 4%. 

Although we may expect the position of the break 
in the power spectrum, k^, to be in the vicinity 
of k ~ 0.2 (e.g., from the PD approximation), we 
should check the robustness of our results to the ac- 
tual choice of k^. Figure |5| shows the derived values 
of Q, m and the corresponding likelihood as a function 
of the value of k^. We find that the results are quite 
insensitive to the choice of k^ over a wide range. At 
&b values much smaller than 0.1, corresponding to 
large separations between pairs of objects and thus 
involving mostly distant objects of large errors, there 
are insufficient data to constrain the power spectrum, 
and therefore the errors become big and the results 
quite meaningless. At very large values of k\,, the 
analysis is expected to recover the results of the old 
linear analysis with no break. It indeed does so, but 
only when k^ approaches the artificial cutoff applied 
to P(k) arbitrarily at fe max = 8 (/i Mpc) for the 
purpose of finite numerical integration. It seems that 
any little freedom allowed in the model beyond the 
strict linear power spectrum is enough for correcting 
the bias associated with the linear analysis. 

As mentioned earlier, an alternative way to incor- 
porate nonlinear effects is by adding to the linear ve- 
locity correlation model a free parameter of uncorre- 
cted velocity dispersion at zero lag, a v . When this 
nonlinear correction is applied by itself to the M3 
data, the best value of fl m becomes 0.38 (instead of 
0.56 in the linear analysis) with o~ v = 250 km s -1 . 

We then apply to M3 the two different nonlinear 
corrections together, i.e., a break in the power spec- 
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Fig. 3. — The recovered power spectra from the real data of M3 (left) and SFI (right). The P(k) yielded by the purely 
linear analysis is marked "L", while the nonlinear analysis, with a break at k = 0.2 (/i _1 Mpc) _1 , is marked "NL". Also 
shown for comparison is an extrapolation of the linear part of the recovered P(k) into the nonlinear regime by the PD 
approximation. The P{k) is in units of (/i _1 Mpc) 3 . 
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Fig. 4. — Likelihood function for the values of f2 m due to the nonlinear analysis with k^ = 0.2 (h 1 Mpc) , from the 
real data of M3 (left) and SFI (right). 



trum at k = 0.2 as well as a free velocity dispersion 
term. Figure ^ shows a map of the resulting like- 
lihood in the Q m -a v parameter plane. The best-fit 
value of cr v is close to zero, indicating that the two 
nonlinear corrections are practically redundant. 

4.2. Deviations from ACDM 

Encouraged by the success of breaking the power 
spectrum into two detached segments, we now push 
the idea further, and divide the power spectrum into 4 
detached segments. This allows a more general shape 
for P(k), less dependent of a priori assumptions 



about a physical model such as ACDM. By doing so 
we may detect clues for deviations from the "stan- 
dard" P(k) shape, but in this case we clearly give up 
the attempt to determine cosmological parameters. 

Our 4-band model for P(k) consists of the following 
segments: 

1. COBE- normalized ACDM in the extreme linear 
regime, k < 0.02, with Q m fixed at the most 
likely value from the nonlinear analysis. 

2. A free constant amplitude in the interval 0.02 < 
k < 0.07, at the vicinity of fe pea k. 
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Fig. 5. — Robustness to the break scale. The results of the nonlinear analysis with the broken-ACDMpower spectrum, 
for the real data, as a function of the break length k\> in units of (A Mpc) . The errors are 84% from the likelihood 
function. Bottom: the recovered value of Q m . Top: the corresponding likelihood. Left: M3. Right: SFI. 



3. An independent free constant amplitude in the 
interval 0.07 < k < 0.2, just short of the transi- 
tion between the linear and nonlinear regimes. 

4. A power law with two free parameters (as be- 
fore) in the nonlinear regime, k > 0.2. 

Figure [7| shows the recovered 4-band P(k) 
from the real data of M3 and SFI, in compar- 
ison with the ACDM results of the linear and 
nonlinear analysis discussed above. The most 
likely parameters of the three meaningful segments 
(k > 0.02) are (9750, 370, 160A;- - 95 ) for M3 and 
(7450, 1000, 160AT 1 - 40 ) for SFI, where the amplitudes 
are in unites of (/i _1 Mpc) 3 and k is in units of 
(h M.pc) . The errors are shown in the figure. 

The nonlinear segment, not surprisingly, practically 
recovers the results of the broken- ACDM analysis. 
The most linear segment can be fixed almost arbi- 
trarily without affecting the results in the two inner 
bands. This is not surprising either since we can- 
not expect our current data to contstrain the power 
spectrum on these very large scales. 

The second segment in the linear regime, 
(0.02, 0.07), tends to lie above the peak obtained 
in the broken- ACDM analysis, while the third lin- 



ear interval, (0.07, 0.2), in the "blue" side of the 
peak, shows a low amplitude. We thus see a "wig- 
gle" about the linear portion of the broken-ACDM 
best fit, which is detected independently in the two 
data sets. The features in the linear regime con- 
tribute only a marginal improvement to the overall 
likelihood, which is still dominated by the nonlinear 
segment. The significances of the deviations, both 
the excessive peak and the low dip, are slightly above 
la each. Our finding should therefore be considered 
as a marginal hint only; it could be just a fluke due 
to the distance errors and cosmic variance. But it is 
still an intriguing feature, especially since it appears 
consistently in our two samples. 

The marginal deviation from the broken-ACDM 
P(k) thus consists of a wiggle, with a power excess 
near the peak, k ~ 0.05, and a deficiency at k ~ 0.1. 
The missing power is reminiscent of the indications 
for "cold flow" in the galaxy peculiar velocity field 
in the local cosmological neighborhood. While the 
streaming motions on scales of a few tens of mega- 
parsecs are on the order of several hundreds of kilome- 
ters per second, the dispersion velocity of field galax- 
ies is only on the order of ~ 200 km s -1 , indicating a 
high Mach number on comparable scales (e.g., Suto, 
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Fig. 6. — Robustness to different nonlinear corrections. The likelihood, for the real M3 data, when allowing both a 
break at k = 0.2 and a velocity-dispersion term, as a function of the free parameters VL m and a v . The contours correspond 
to 68%, 95.4% and 99.73% in the two-parameter plane. 
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Fig. 7. — The 4-band power spectrum for the M3 (left) and SFI (right) data, compared to the best-fit ACDM power 
spectra, linear (L) and nonlinear with a break (NL). 



Cen & Ostriker 1992; Chiu, Ostriker & Strauss 1998; 
Dekel 2000). 

A hint for a similar wiggly feature has been de- 
tected in the density P(k) as derived by some of the 
researchers from the distribution of galaxies (Baugh 



& Gaztanaga 1998; Landy et al. 1996) and clusters 
(Einasto et al. 1997; Suhhonenko & Gramann 1999). 
Most recently, there are indications for such a wiggle 
in the preliminary P(k) derived from part of the 2dF 
redshift survey (private communication with the 2dF 
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team) . 

Most interestingly, the scale of the missing power 
in our local P(k) from velocities roughly coincides 
with the scale of the second peak in the angular 
spectrum of the CMB. Preliminary balloon measure- 
ments (Boomerang, de Bernardis et al. 2000; Max- 
ima, Hanany et al. 2000) indicate that this peak is 
somewhat lower than expected by the common CDM 
models. This may be a reflection of the same phe- 
nomena which we detect here as "cold flow" in the 
peculiar velocity data. 

The scale of the wiggly feature roughly coincides 
with the most obvious physical scale in cosmology 
- the size of the cosmological horizon at the time 
of transition from radiation to matter dominance, or 
slightly later, at the epoch of plasma recombination 
and radiation-matter decoupling. A wiggle on these 
scales can be produced by an excess of either baryons 
or massive neutrinos in the cosmological mass budget. 
But the excess required to produce a significant wig- 
gle seems to violate upper limits from other data; the 
density of baryons is limited by He+D abundances via 
the theory of Big-Bang nucleosynthesis (Tytler et al. 
2000) , and the density of neutrinos is constrained by 
large-scale structure (e.g., Ma 1999; Gawiser 2000). 

5. PRINCIPAL COMPONENT ANALYSIS 

The linear ACDM analysis of both the M3 and SFI 
data (Z97; F99) revealed a warning signal concern- 
ing the GOF, which we termed "the two-halves prob- 
lem" . When the linear analysis is applied separately 
to two halves of the data, separated either by the 
median distance or by line-width (which is correlated 
with the distance), the results are somewhat differ- 
ent. The distant data prefer a lower-amplitude power 
spectrum, associated with a lower value of Q. m . The 
mock catalogs (tested with the true correlation ma- 
trix) have not revealed a similar problem, indicating 
that it is caused by inadequacies of the correlation 
matrix compared with the real data. These shortcom- 
ings may be associated with the assumed theoretical 
model, either the shape of the P(k) or the Gaussian 
nature of the fluctuations, or with the error model, 
either its Gaussianity or its radial dependence. These 
worries motivate an attempt to evaluate the GOF in 
our linear and nonlinear analyses. In particular, we 
wish to see to what extent the revised P(k) in the 
nonlinear regime may resolve the two-halves problem. 

Assume a data vector d, which is a random re- 
alization of an n-dimensional multivariate Gaussian 
distribution, with the correlation matrix C = (dd'). 
A global GOF could be evaluated using the x 2 statis- 



tic, x 2 = dtC-M = Tr{C- l D), where D = ddt. 
If C is the true correlation matrix, then this quan- 
tity should obey a x 2 distribution with n degrees of 
freedom, as indeed is the case for all the analyses we 
performed. But this single number cannot capture 
all the particulars of the fitting process. 

A Principal-Component Analysis, in which the 
data are represented in terms of the eigenvector basis 
of the (assumed) correlation matrix, is a powerful tool 
for our purposes in several different ways. Our orig- 
inal motivation for applying a PCA (following Voge- 
ley & Szalay 1996; Tegmark, Taylor & Heavens 1997) 
was to allow optimal compression of the data into the 
modes that are most important for estimating the pa- 
rameters we wish to evaluate, with the aim to reduce 
the computational cost associated with inverting huge 
correlation matrices, and to improve the results given 
an inaccurate correlation matrix. In the current pa- 
per, we use a PCA for two other purposes. First, we 
present a novel approach of using the PCA modes 
for identifying certain gross features of the data and 
model via the correlation matrix. Second, we extend 
a method first used by Hoffman & Zaroubi (2000) 
for evaluating GOF in fine detail 6 and for trying to 
resolve the two-halves problem. 



5.1. Modes of S + N versus S/N 

The standard PCA is as follows. A general coor- 
dinate transfromation of the data d defines a new 
set of m random variables d = ^d, where ^ is 
an m x n matrix, which we assume to be of full 
rank. It is then clear that the distribution of the new 
variables is still Gaussian, with a correlation matrix 
C = (ddt) = ^c^ 1 ". The likelihood analysis can be 
performed in terms of these new "data" points. If we 
keep all the original information, i.e., if ^ is invert- 
ible, the likelihood analysis is not affected because 
then TriC^D) = Tr(C _1 D). Since the correlation 
matrix is symmetric and positive definite, we can, 
without loss of generality, pick the matrix such that 
C is diagonal. Then the rows di of \I> are the eigenvec- 
tors of the correlation matrix, or its principal compo- 
nents, and the diagonal terms Aj of C are the corre- 
sponding eigenvalues, In statistical terms this means 
that the new variables are expected to be uncorre- 
lated. The validity of this independence of variables 
is a measure of GOF. We thus use the x 2 statistic to 
test the hypothesis that di / are uncorrelated unit 
Gaussian random variables. 7 If this test uncovers sys- 
tematic effects, it may become possible to associate 

particular case of cumulative % 2 P er degree of freedom for S + N 
modes. 



In the terminology which is introduced later, they used the 
modes, and we extend the analysis to differential \ 2 an d to S/N 

7 Another advantage of having the modes uncorrelated is that, when compressing the data, it makes sense to have no correlation 
between the data kept and the data eliminated. 
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Fig. 8. — Average distances associated with the eigenmodes of the correlation matrix of the linear ACDM model. The 
eigenmodes are ranked by decreasing eigenvalue amplitude, (low m — high eigenvalue). The standard deviation is shown 
for every tenth mode. Left: S/N modes. Right: S + N modes. Top: M3 data. Bottom: SFI data. 



them with certain features of the data and model via 
a further investigation of the eigenmodes. 

The eigenmodes are ordered by the amplitude of 
their eigenvalues, from large to small, and the high- 
eigenvalue modes are assigned a higher signifcance, 
because the confidence levels in the recovered pa- 
rameters of a maximum-likelihood analysis inversly 
correlate with the squares of the eigenvalues of the 
modes used (Tegmark, Taylor & Heavens 1997). Fur- 
theremore, perturbation analysis implies that small- 
eigenvalue modes are more sensitive to perturbations 
in the correlation matrix, implying that the mode-by- 
mode statistical tests may not be reliable for small- 
eigenvalue modes. Since our correlation matrix is 
expected to be only an approximation to the true 
correlation matrix, it would be advantageous, in gen- 
eral, to avoid small-eigenvalue modes, and rely on the 
high-eigenvalue ones. 

A straightforward application of PC A is with the 
original correlation matrix of Eq. (||) , which is a sum 
of signal and noise: C = S+N. In this case, the large 



eigenvalues may correspond either to large signal, or 
large noise, or both. Another possibility, which we 
term S/N, is to first perform a "whitening" trans- 
formation, d = N~^d (Vogeley & Szalay 1996). In 
the case of a diagonal noise matrix N, this transfor- 
mation amounts to normalizing the data in terms of 
the expected noise. The new correlation matrix is 
C = N~zSN~2 + /, where / is the identity matrix. 
The eigenvalues of C are the signal-to-noise ratios of 
the corresponding principal modes. 8 



5.2. Correlation Between Mode and Distance 

The eigenmodes can help us identify certain fea- 
tures of the data and models. In particular, the sig- 
nal part involves the geometry of sampling and the 
prior model, and the noise part involves the distance 
error estimates. A useful diagnostic statistic to assign 
with each eigenmode is the distance from the Local 

8 It seems, therefore, that the S/N modes can provide a proper basis for optimal data compression. As said before, if the models 
for the signal and errors are perfectly accurate, then the truncation would not affect the result while the estimated uncertainties 
will grow. However, if the correlation matrix is only approximate, using the high-S/N part of the data may improve the results. In 
particular, since the correlation matrix is quadratic in the data, an error in the error model would necessarily lead to a systematic 
error in the results. By eliminating the law-S/N modes such systematics may be reduced. 
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Group. For eigenvector v, the average distance is 



( r )v = \ v (9)\ 2 r(g) , 



(10) 



where the sum is over the sample of galaxies, r(g) is 
the distance of galaxy g, and v(g) defines the vector 
v in the basis g. This is a weighted average of the 
galaxy distances. The variance ((r — (r) v ) 2 ) v is de- 
fined in analogy. If the standard deviation is small 
compared to the average distance, one can conclude 
that most of the information associated with this 
mode comes from galaxies within a certain distance 
range. 

The distance associated with a mode provides im- 
portant information about the mode: distant modes 
are typically noisier than nearby ones because the 
distance error is proportional to distance. The cor- 
relation between modes and distance could also help 
us understand the two-halves problem. In the follow- 
ing mode-by-mode analysis, we use this correlation to 
interpret a correlation with mode number as a corre- 
lation with distance. 

Figure || shows the average distance for each mode, 
for the linear ACDM model and either the M3 or 
SFI data. We see that the S/N modes are correlated 
with distance, such that the high-eigenvalue modes, 
which are robust and of high signal-to-noise ratio, 
are typically associated with nearby data, which are 
of relatively small errors. On the other hand, these 
nearby modes tend to involve close galaxy pairs, and 
are therefore more subject to nonlinear effects, which 
makes the nonlinear correction a must. The correla- 
tion is strong for M3, and weaker for SFI. 

The S + N modes show a somewhat stronger cor- 
relation in the opposite sense, in which the high- 
eigenvalue modes, except for the first few, are typ- 
ically associated with large distances and therefore 
noisy data. This means that most of the S+N modes 
are dominated by the noise rather than the signal. 
This situation is unfortunate for the S + N PCA; for 
example, it does not allow a sensible truncation by 
S + N modes. But it should allow a more sensitive 
measure of GOF, refering in particular to the error 
model. Again, the correlation is stronger for M3 than 
for SFI. 

5.3. Goodness of Fit Mode by Mode 

After PCA, assuming that we know the true cor- 
relation matrix, the variables d are expected to be 
uncorrelated, and we expect xf = dj/Xi to be about 
unity for each and every mode separately. The valid- 
ity of this behavior mode by mode provides an im- 
proved and finer test of GOF in two ways. First, it 
tests whether the eigenmodes of the prior correlation 



matrix are really uncorrelated, with the variance de- 
termined by the eigenvalues. Then, in the case of a 
poor fit for a certain mode, it can guide us to the 
source of the poor fit via the properties associated 
with that mode. 

One statistic we use, for each mode number m, is 
the cumulative x 2 P er degree of freedom, Y^ILi Xi/ m i 
in which the sum starts from the high-eigenvalue 
modes and ends at mode m. In the case of inde- 
pendet modes, the expected value is unity, and the 
expected standard deviation is about y/2/m (the nor- 
malized standard deviation of a x 2 distribution with 
m degrees of freedom) . 

A second set of more localized statistics is the 
differential estimates, obtained by averaging the x 2 
values over intervals of mode numbers of length s 
namely, using J2i^=m-s+iXi / s f° r various values of 
m. If the correlation matrix was exact, these would 
be independent (assuming the intervals are disjoint) 
and follow a x 2 distribution with s degrees of free- 
dom. In particular, the expectation value would be 
unity with a standard deviation ~ \J2/ s. We choose 
s = 50, which is large enough for good statistics and 
small enough with respect to the total n for the pur- 
pose of tracing systematic effects. 

Figure ^ shows the cumulative x 2 statistic 
function of m for the linear ACDM analysis and 
for the nonlinear broken-ACDM analysis. Figure 1C 
shows the corresponding differential x 2 statistic. 

For the S/N modes of M3, the GOF of the lin- 
ear model is marginal, in the sense that the typical 
deviations in the cumulative statistic are at the 2a 
level. The low-m modes, except for the very first 
ones, typically have low x 2 /dof values, while the 
large- m modes have high values. This systematic 
behavior with m can be translated to a systematic 
correlation with distance via the correlation between 
distance and mode (Figure ||). It is therefore a remi- 
niscence of the two-halves problem. 

Can we use the PCA to distinguish between an in- 
adequate model of P(k) and a problem in the error 
model? We see in Figure ^ that the broken-ACDM 
model clearly improves the GOF as far as the S/N 
modes of M3 are concerned. With this model, the cu- 
mulative x 2 /dof lies well inside the 2a contours for 
all the modes, with no apparent systematic depen- 
dence on m. It implies that the broken-ACDM P(k) 
is a more appropriate model for the data. Based on 
the S/N modes, there is no indication that the error 
model may be inadequate. 

When we analyze the S + N modes in a similar 
way in Figure ^, the linear model, for both data sets, 
shows a more severe deviation of x 2 /dof from unity, 
at the 4 — 5a level, and a similar systematic depen- 
dence on m. This trend is also apparent in Figure [LQ 
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(middle panels). The two-halves problem is very ob- 
vious here, with the more distant data, correspond- 
ing to larger eigenvalues and larger noise, favoring a 
smaller amplitude for the power spectrum than the 
nearby data. In this case, the use of the better, 
broken-ACDM model makes only a small improve- 
ment which does not resolve the problem. This is a 
clear indication that something may be wrong in the 
error model as well. 

We then recall that the low-m S + N modes are 
associated with large distances, where the errors are 
large and are known to a lesser accuracy. Guided 
by Figure [8|, we try a poor-man compression of the 
data by eliminating from the analysis all the data 
points that lie at an inferred distance greater than 
60h _1 Mpc. This leaves us with 843 out of the 1124 
(grouped) data points of M3, and 996 out of the 1156 
galaxies of SFI. This truncation makes only a negligi- 
ble change in the the best-fit value of Q m (an increase 
of less than 3%, both for M3 and SFI), and it causes 
only a minor widening of the likelihood contours. In 
the case of M3, we see in Figure |9] that the S + N 
modes of the linear model and truncated data show 
an improved GOF compared to the case of the whole 
data, but the x 2 /dof still show ~ 3a deviations from 
unity and a systematic dependence on m. However, 



the S + N modes of the broken-ACDM model and 
truncated M3 data now do lie within the 2a contours. 
The systematic trend with m is still apparent, indi- 
cating that the correlation matrix is still not perfect; 
either the error model is only an approximation even 
for the truncated data, or the broken-ACDM P(k) is 
not yet a perfect model (as seen in §||), or the signal 
and/or the noise are not exactly Gaussian. 

In the case of SFI, while the S/N modes look very 
adequate with both models, for the S + N cumu- 
lative statistic the improvements due to the nonlin- 
ear correction and the elimination of large-distance 
galaxies are apparently not enough for an acceptable 
GOF. According to the differential statistic, the non- 
linear correction and truncation do bring each of the 
first few bins into their 2a range, but the fact that 
many of these bins are each not much above the —2a 
line makes the cumulative statistic lie outside its 2a 
range. Since the large-eigenvalue S+N modes, which 
dominate the cumulative statistic, are dominated by 
noise, the limited GOF is likely to point at further 
shortcomings of the error model for SFI. 

6. CONCLUSION 

A likelihood analysis is supposed to recover unbi- 
ased values for the free parameters of a model pro- 
vided that the prior theoretical model and the error 
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model allow accurate description of the data. We 
addressed here tools to recover the parameters given 
incomplete knowledge of these models. 

Using mock catalogs based on high-resolution sim- 
ulations, we realized that the likelihood analysis of 
peculiar-velocity data, based on the linear ACDM 
power spectrum, is driven by the nonlinear part of 
the spectrum which is not modeled accurately, and 
might therefore yield biased results. For example, in 
the linear analysis of the M3 mock data, the obtained 
amplitude of P(/c)0^ 2 is overestimated, correspond- 
ing to a positive bias in the cosmological density pa- 
rameter Q m by ~ 35%. 

A broken-ACDM power spectrum, in which the 
k > fcb segment of the power spectrum is replaced 
by a more flexible two-parameter power law, allows 
a better, independent fit in the nonlinear regime. It 
then frees the linear part of the spectrum from non- 
linear effects, and yields unbiased results for O m . The 
results are robust to the specific choice of fct,; we 
choose &b = 0.2 (/i -1 Mpc) -1 , which is where the non- 
linear density P(k) is expected to start deviating from 
the linear P(k) by the PD approximation. The results 
are also robust to the exact way by which the nonlin- 
ear effects are incorporated. When we add a zero-lag 



velocity dispersion term to the correlation function, 
either replacing the break in the power spectrum or 
in addition to it, the results are similar. 

We note that these procedures do not mean to 
tell us much about the actual power spectrum in 
the nonlinear regime, as our improved analysis in 
the nonlinear regime is still based on the linear rela- 
tion betweem velocity and density. For an improve- 
ment in this direction, one should come up with a 
physically motivated functional form that accounts 
for mildly nonlinear corrections to the linear velocity 
power spectrum, in analogy to the PD correction to 
the density P(k). Nevertheless, as demonstrated by 
the tests based on mock catalogs and by the robust- 
ness to the actual nonlinear correction applied, our 
current procedures are reliable for eliminating the bi- 
ases in the linear regime. 

When applied to the real data of M3 or SFI pe- 
culiar velocities, assuming a flat ACDM model with 
n = 1 and h = 0.65, the improved analysis yields 
best-fits of n m = 0.32 ± 0.06 and 0.37 ± 0.09 re- 
spectively, corresponding to cgil^ 6 ~ 0.49 ±0.06 and 
0.63 ± 0.08 respectively. These values are in good 
agreement with most constraints from other data, 
including CMB anisotropics and cluster abundance 



9 Joint analyses of peculiar velocities with other dynamical data free of galaxy biasing were pursued before based on the linear 
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(e.g., Bahcall et al. 1999). 9 It is important to stress 
that the estimates of the cosmological parameters re- 
ported here are valid under the assumption that the 
power spectrum in the liner regime is drawn from 
the flat ACDM cosmolgical model with Q m as a free 
parameter. In our current nonlinear analysis of the 
data inside 60 h _1 Mpc, the maximum- likelihood solu- 
tion based on the ACDM power spectrum indeed has 
an acceptable goodness-of-fit, making our estimate 
of £l m self-consistent and meaningful. Although we 
conclude below that our error model is not accurate 
for M3 beyond 60h _1 Mpc and for SFI, the obtained 
value of il m is insensitive to the exclusion of the noisy 
data beyond 60h^ 1 Mpc and to other changes in the 
error model. 

By allowing a more general shape for the power 
spectrum, with 4 detached segments, we detect a 
marginal indication for a deviation from the ACDM 
power spectrum. The possible deviation is char- 
acterized by a wiggle, with an enhanced ampli- 
tude near fc pe ajs ~ 0.05 and a depletion near k ~ 
0.1 (/i _1 Mpc) _1 . The study of possible deviations 
from the ACDM model is done at the expense of 
the attempt to estimate tl m , which requires a power- 
spectrum shape based on a physical cosmological 
model. Nevertheless, we learn from the 4-band analy- 
sis that the relatively low Q m estimate in the broken- 
ACDM analysis is driven by the low amplitude of the 
power spectrum near k ~ 0.1 (/i _1 Mpc) _1 . 

The indicated "cold flow" on a scale of a few tens 
of megaparsecs is reminiscent of similar indications 
from the power spectrum of galaxies and clusters in 
redshift surveys (§|5|). Most recent is the wiggle seen 
in the preliminary power spectrum derived from the 
2dF redshift survey. The local cold flow may be re- 
lated to the second peak in the CMB angular power 
spectrum on a similar scale. The wiggly feature in 
the power spectrum may be interpreted as a possi- 
ble indication of a deviation from the standard cos- 
mological mass mixture, e.g., a higher baryonic con- 
tent than indicated by the Deuterium abundance and 
Big-Bang nucleosynthesis, or a non-negligible contri- 
bution from hot dark-matter in the form of massive 
neutrinos. However, the possibility that this feature 
is a statistical fluke due to cosmic variance in the 
context of the ACDM model cannot be ruled out yet. 

A principal component analysis, either in S/N or 
S + N modes, allows a fine test of goodness of fit, 
by applying a x 2 test mode by mode. It shows that 
the broken-ACDM model is a better fit to the data 
than the purely linear ACDM model. For M3, using 
the "whitened" S/N modes, the nonlinear correction 
is enough to eliminate the "two-halves" problem that 
troubled the linear analysis. This indicates that the 



ACDM model is a good model for the signal. When 
the S + N modes are analyzed, the correction to the 
theoretical model is helpful but not enough for an 
acceptable GOF. When the M3 data is further trun- 
cated at 60h _1 Mpc, eliminating distant galaxies for 
which the errors are large and the error model is inac- 
curate, the GOF becomes acceptable. This strength- 
ens the reliability of our parameter estimates based 
on the assumed ACDM model in the linear regime. 
Future investigations should refine the error model 
at large distances, in order to possibly reduce the 
cosmic variance in our estimates. For SFI, the S/N 
modes seem adequate, confirming again the suitabil- 
ity of the ACDM model in the linear regime, but the 
S + N PCA indicates that the errors in this catalog 
are still more complex than assumed. 

The PCA is a powerful tool for addressing inter- 
esting properties of the data and its relation to the 
best-fit theoretical and error models. In particular, 
we associated each mode with a geometric property 

- the mean distance and the variance about it - 
and thus learned about the correlation between mode 
eigenvalues and distance errors. This was useful in 
the study of GOF and in truncating the data to deal 
with inaccuracies in the error model. The PCA will 
be extremely useful when one tries to compress the 
data while keeping the optimal part for determining 
a specific desired parameter. This compression may 
be mandatory for computational reasons when the 
body of data is excessively large. Since the model 
is expected to be incomplete, either in terms of the 
theoretical assumptions or the errors, a proper com- 
pression of the data may in fact improve the results. 
Such data compression using PCA in the context of 
the analysis of cosmic flows is in progress. 

Our conclusions can be summarized at three lev- 
els, as follows. First, if one is willing to assume the 
"standard" ACDM power-spectrum shape in the lin- 
ear regime, then the nonlinear corrections yield an 
unbiased value of f2 m ~ 0.35, consistently from the 
two data sets. The GOF is acceptable for this non- 
linear analysis and the cosmological model and er- 
rors assumed, once the M3 data inside 60h~ 1 Mpc 
are considered (and the value of fl m is insensitive to 
the inclusion of the other, noisy data). This makes 
the parameter estimation self-consistent and mean- 
ingful, and this is our strongest result. Second, once 
we alleviate the requirement of a physically motivated 
power spectrum and its dependence on cosmological 
parameters, we detect a marginal indication for a wig- 
gle about the ACDM power spectrum in the linear 
regime. This is a marginal detection, which does not 
ivalidate the suitability of the ACDM model for the 
cosmological parameter estimation performed above. 



analysis (Zehavi & Dekel 1999), and now based on the nonlinear analysis (Bridle et al. 2001). 
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The hint for a wiggle is still intriguing because it re- 
peats in the two data sets, it seems to coincide with 
similar clues from other data, and it may have inter- 
esting theoretical implications on the nature of dark 
matter and the initial fluctuations. Third, we learn 
from our newly developed PCA analysis that in order 
to possibly impprove the statistics one should refine 
the error model for the most distant galaxies in M3, 
and for a larger fraction of the galaxies in SFI. But 
our parameter estimates and the marginal detection 
of a wiggle are driven by the part of the data for 



which the error model yields an acceptable goodness 
of fit. 
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